Using group testing in a two-phase epidemiologic design to identify the effects of a large number of antibody reactions on disease risk

Background: The role of immunological responses to exposed bacteria on disease incidence is increasingly under investigation. With many bacterial species, and many potential antibody reactions to a particular species, the large number of assays required for this type of discovery can make it prohibitively expensive. We propose a two-phase group testing design to more efficiently screen numerous antibody effects in a case-control setting.
Methods: Phase 1 uses group testing to select antibodies that are differentially expressed between cases and controls. The selected antibodies go on to Phase 2 individual testing.
Results: We evaluate the two-phase group testing design through simulations and example data and find that it substantially reduces the number of assays required relative to standard case-control and group testing designs, while maintaining similar statistical properties.
Conclusion: The proposed two-phase group testing design can dramatically reduce the number of assays required, while providing comparable results to a case-control design.
Supplementary Information: The online version contains supplementary material available at 10.1186/s12874-022-01798-0.


Background
Group testing procedures have been used for disease screening and prevalence estimation since the early 1940s [1]. With group testing, rather than separately testing individual samples for a binary biological response, samples are pooled together into a group and a group assessment of positivity is determined. Two major uses of group testing are disease status identification and prevalence estimation. For disease identification, the goal is to test samples in groups so as to fully identify all disease cases with the fewest tests [2 and references within]. A common strategy is to test a group that consists of combined samples and to continue further only if the group outcome is positive; otherwise, one stops and concludes that all samples in the group are disease negative. For prevalence estimation, on the other hand, only the group outcomes are needed, without necessarily identifying individuals [3 and references within for a literature review]. Group testing designs have increasingly been used as a cost-effective alternative to individual testing in the biosciences [4]. This paper proposes a novel two-phase group testing design for identifying case-control differences among many antibodies in an epidemiologic setting.
In the first phase of the proposed design, the prevalence of antibody reactivity is estimated in cases and controls using only the combined sample results from group testing without individual retesting. Zhang et al. and references within investigate situations where retesting positive pools results in efficiency gains for prevalence estimation [5]. In a similar vein, we retest positive pooled samples with individual tests, but to reduce the number of tests, we do this only for antibodies with preliminary statistical evidence for a case-control difference. This new design is compared to a case-control design with individual testing on all antibodies and to a standard group testing design where positive pooled samples are retested for all antibodies without regard to examining preliminary case-control differences.
The role of immunological responses to exposed bacteria on disease incidence is increasingly under investigation. New technologies for identifying many antibody-specific reactions to particular bacterial exposures are being developed and used in epidemiologic settings [6]. With many (> 1,000) potential antibody reactions to a bacterial species, and multiple (> 15) potential species being examined in a single study, this analysis may be high dimensional (n > 15,000), and therefore may be prohibitively expensive.
This two-phase group testing design is motivated by a recent study focused on screening for case-control differences in Helicobacter pylori antibodies to better understand risk of gastric cancer [6]. Since this was the first study of this type, it focused on only one bacterial species (Helicobacter pylori), but with additional species, future studies may analyze over 15,000 antibodies. Our aim is to develop a design that minimizes the number of serologic tests required in this type of setting. We propose a two-phase approach for the efficient detection of antibody case-control differences (with the goal of identifying potential target antibodies for further investigation). Group testing is used in the first phase to select a subset of antibodies with preliminary evidence for a case-control difference, and in the second phase individual samples from positive pools are retested for the antibodies in that subset. We show how to implement this approach and, through simulations, demonstrate the substantial reduction in the number of serologic tests required relative to a standard case-control design.

Methods
An analysis of the case-control study without group testing would require a direct comparison of the frequency of antibody-specific reactions between cases and controls across the large number of antibodies, as depicted in Fig. 1. These frequencies are usually based on thresholding a quantitative serological assay or can be directly assessed with a qualitative assay that is inherently dichotomous. A case-control analysis with 15,000 antibodies would require a number of assays equal to 15,000 multiplied by the total study sample size. The large number of assays required for a sufficiently powered study would make this approach infeasible. We propose a group testing strategy to substantially reduce the number of required assays (tests) without sacrificing much power.
Our inferential goal is to test for case-control differences for each antibody while controlling the point-wise error rate (i.e., each antibody-specific comparison between cases and controls has a type I error rate of α). We recognize that the number of antibodies is large and that we would therefore expect an average number of false discoveries of α multiplied by the number of antibodies.
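As a rough worked example of the scale involved, assume the 500 cases and 500 controls used later in the simulation and an illustrative α of 0.05 (the value of α is an assumption here, and the false-discovery count treats every antibody as having no true effect):

```latex
\begin{align*}
\text{assays}_{\text{case-control}} &= 15{,}000 \times (500 + 500) = 1.5 \times 10^{7}, \\
\mathrm{E}[\text{false discoveries}] &\approx \alpha \times 15{,}000 = 0.05 \times 15{,}000 = 750 .
\end{align*}
```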
We propose a two-phase design in which the first phase screens antibodies using group testing and an antibody proceeds to the second phase only when there is a good indication of an effect. We describe the procedure as follows.

Phase 1
In phase 1, we use group testing to estimate the prevalence of antibodies, compare the prevalence estimates between cases and controls, and use this comparison to select antibodies. We split the observations by case-control status into groups of equal size. We then test each group for each antibody. We estimate the case and control prevalences for each antibody using the Burrows estimator ([7, 8] and references within). This estimator is given by

$$\hat{p} = 1 - \left(1 - \frac{x}{n + \frac{k-1}{2k}}\right)^{1/k},$$

where $x$ is the number of positive groups, $k$ is the group size, and $n$ is the number of groups.
Although prevalence can be estimated using maximum likelihood, the maximum-likelihood estimator (MLE) is biased. An alternative estimator proposed by Burrows eliminates most of the bias. In addition, Burrows showed empirically that his estimator not only reduces the bias but also yields a smaller mean squared error (MSE) than the MLE for all values of p considered (p ≤ 0.5) [7].
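A minimal Python sketch of the two estimators may make the comparison concrete. It assumes the standard form of Burrows' estimator, 1 − (1 − x/(n + ν))^(1/k) with ν = (k − 1)/(2k). The function names and the example counts are illustrative and are not taken from the study.

```python
def mle_prevalence(x: int, n: int, k: int) -> float:
    """MLE of prevalence from x positive groups out of n groups of size k."""
    return 1.0 - (1.0 - x / n) ** (1.0 / k)


def burrows_prevalence(x: int, n: int, k: int) -> float:
    """Burrows' bias-reduced estimate: the MLE with n replaced by n + (k - 1) / (2k)."""
    nu = (k - 1) / (2 * k)
    return 1.0 - (1.0 - x / (n + nu)) ** (1.0 / k)


# Illustrative numbers only: 100 groups of 5 samples, 40 positive groups.
p_mle = mle_prevalence(40, 100, 5)
p_burrows = burrows_prevalence(40, 100, 5)
```

For any group size above 1 and at least one positive group, the Burrows estimate is slightly smaller than the MLE, which is the direction needed to offset the upward bias of the MLE. It also stays below 1 even when every group is positive.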
We compute a two-sided z-test for each antibody to evaluate evidence for a case-control difference,

$$Z_{\text{antibody}} = \frac{\hat{p}_{\text{case}} - \hat{p}_{\text{control}}}{\sqrt{\mathrm{var}(\hat{p}_{\text{case}}) + \mathrm{var}(\hat{p}_{\text{control}})}},$$

where $\hat{p}_{\text{case}}$ and $\hat{p}_{\text{control}}$ are Burrows' estimators for the cases and controls, respectively. The variances of these estimators are computed as

We then use the two-sided p-value from the calculated z-statistic to determine whether there is enough evidence of a difference for that antibody to advance to phase 2 individual testing. If the p-value is less than the phase 1 cutoff ($c_1$), we conduct individual testing; if it is greater, we assume there is no effect.
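The phase 1 screen can be sketched in Python as follows. The variance formula used here is an assumption: it is the usual delta-method (asymptotic) variance of the group-testing MLE, evaluated at the Burrows estimate, and may differ from the expression the authors used. The function names, the default cutoff, and the example counts are likewise illustrative.

```python
import math


def burrows_prevalence(x, n, k):
    """Burrows' estimate from x positive groups out of n groups of size k."""
    nu = (k - 1) / (2 * k)
    return 1.0 - (1.0 - x / (n + nu)) ** (1.0 / k)


def asymptotic_variance(p, n, k):
    # ASSUMPTION: delta-method variance of the group-testing MLE,
    # (1 - (1 - p)^k) / (n k^2 (1 - p)^(k - 2)), evaluated at p.
    return (1.0 - (1.0 - p) ** k) * (1.0 - p) ** (2 - k) / (n * k ** 2)


def phase1_screen(x_case, x_control, n, k, c1=0.3):
    """Return (z, two-sided p-value, advance-to-phase-2 flag) for one antibody."""
    p_case = burrows_prevalence(x_case, n, k)
    p_control = burrows_prevalence(x_control, n, k)
    var_sum = asymptotic_variance(p_case, n, k) + asymptotic_variance(p_control, n, k)
    if var_sum == 0.0:
        # No positive groups in either arm: no evidence of a difference.
        return 0.0, 1.0, False
    z = (p_case - p_control) / math.sqrt(var_sum)
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # equals 2 * (1 - Phi(|z|))
    return z, p_value, p_value < c1


# Illustrative counts: 100 groups of 5 per arm, 55 vs 35 positive groups.
z, p_value, advance = phase1_screen(55, 35, n=100, k=5)
```

Equal group counts in the two arms give z = 0 and a p-value of 1, so such an antibody would not advance under any cutoff below 1.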

Phase 2
For those antibodies that proceed to Phase 2, we conduct a Fisher's exact test and conclude there is a case-control difference if the resulting p-value is less than cutoff $c_2$.
The type I error rate of the final test is a function of both $c_1$ and $c_2$. Therefore, given $c_1$, we need to determine $c_2$ to control the final type I error rate at the nominal α level.

Calibration of $c_2$
We use a Monte-Carlo approach to compute $c_2$ as a function of the antibody prevalence, by applying the two-phase design to data generated under the null hypothesis of no case-control effects, with 10,000 realizations for each prevalence value. Figure 2 illustrates how to choose the p-value cutoff used in phase 2 testing to achieve a final α level test. The figure shows the observed p-values in phase 2 testing under the null distribution, for an example antibody prevalence of 0.20. The 1 − α percentile of the resulting phase 2 p-values determines the cutoff value $c_2$. Rather than applying the Monte-Carlo procedure for each of the large number of antibodies (e.g., 15,000), we evaluate $c_2$ as a function of prevalence on a grid from 0 to 1 in steps of 0.01 (this requires performing only 100 Monte-Carlo simulations). Figure 3 shows the Monte-Carlo p-value cutoffs ($c_2$) as a function of prevalence, and these values were used for phase 2 testing. Noting that the resulting curve was not continuous, we also applied Lowess smoothing to construct a continuous curve of $c_2$ as a function of prevalence. However, smoothing the curve made little difference to the testing characteristics relative to simply interpolating between the discrete values, so we used the non-continuous values for simplicity. We identified statistically significant antibody effects by comparing the p-value from the Fisher's exact test to $c_2$.
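The calibration for a single prevalence value can be sketched in Python as below. This is one reading of the procedure, under several assumptions that are not from the source. Antibodies that fail the phase 1 screen are recorded with a p-value of 1 because they can never be declared significant. The cutoff is taken as the order statistic that keeps the empirical rejection rate at or below α. The phase 1 variance is the delta-method expression. The sample sizes and number of realizations are scaled down so the example runs quickly.

```python
import math
import random


def burrows(x, n, k):
    """Burrows' prevalence estimate from x positive groups out of n groups of size k."""
    return 1.0 - (1.0 - x / (n + (k - 1) / (2 * k))) ** (1.0 / k)


def phase1_p_value(x_case, x_control, n, k):
    """Two-sided z-test p-value; the delta-method variance is an ASSUMPTION."""
    p1, p0 = burrows(x_case, n, k), burrows(x_control, n, k)
    var = sum((1 - (1 - p) ** k) * (1 - p) ** (2 - k) / (n * k ** 2) for p in (p1, p0))
    if var == 0.0:
        return 1.0
    return math.erfc(abs(p1 - p0) / math.sqrt(2.0 * var))


def fisher_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = math.comb(row1 + row2, col1)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    probs = {i: math.comb(row1, i) * math.comb(row2, col1 - i) / denom
             for i in range(lo, hi + 1)}
    cutoff = probs[a] * (1 + 1e-7)  # tolerance for ties in table probability
    return min(1.0, sum(p for p in probs.values() if p <= cutoff))


def null_phase2_p_values(prevalence, n_per_arm, k, c1, n_reps, seed=0):
    """Run the two-phase design on null data; non-advancing antibodies get p = 1."""
    rng = random.Random(seed)
    n_groups = n_per_arm // k
    m = n_groups * k
    recorded = []
    for _ in range(n_reps):
        positives, pos_groups = [], []
        for _arm in range(2):  # cases, then controls, same prevalence under the null
            y = [rng.random() < prevalence for _ in range(m)]
            positives.append(sum(y))
            pos_groups.append(sum(any(y[g * k:(g + 1) * k]) for g in range(n_groups)))
        if phase1_p_value(pos_groups[0], pos_groups[1], n_groups, k) < c1:
            recorded.append(fisher_two_sided(positives[0], m - positives[0],
                                             positives[1], m - positives[1]))
        else:
            recorded.append(1.0)
    return recorded


def calibrate_c2(recorded, alpha):
    """Cutoff such that at most a fraction alpha of null p-values fall strictly below it."""
    ordered = sorted(recorded)
    index = math.floor(alpha * len(ordered) + 1e-9)
    return ordered[min(index, len(ordered) - 1)]


# Scaled-down illustration: 100 per arm, groups of 5, 200 null realizations.
p_values = null_phase2_p_values(prevalence=0.20, n_per_arm=100, k=5, c1=0.3, n_reps=200)
c2 = calibrate_c2(p_values, alpha=0.05)
```

Repeating the last two lines over a grid of prevalence values would give a cutoff curve of the kind described for Figure 3.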

Standard group testing approach
An alternative to the two-phase design described above uses group testing to reconstruct the complete data. In this design, group testing is applied to the entire dataset in the following manner: for groups that are negative, we assume all individuals in that group are negative, and for groups that are positive, we retest individual samples. This reconstructs the individual data, to which standard Fisher's exact tests can be applied.
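The testing burden of this standard design can be approximated with a short calculation. The sketch below assumes independent samples with a common prevalence, perfect test sensitivity and specificity, and a group size greater than 1. The function name is illustrative. The simulation in this paper uses correlated responses, so its reported percentages will not match these values exactly.

```python
def expected_tests_per_sample(prevalence: float, k: int) -> float:
    """Expected tests per sample when positive pools of size k are fully retested.

    Each pool costs one test (1/k per sample); a positive pool, which occurs with
    probability 1 - (1 - prevalence)^k, adds one individual test per sample.
    """
    prob_pool_positive = 1.0 - (1.0 - prevalence) ** k
    return 1.0 / k + prob_pool_positive


# Fraction of the individual-testing workload at an illustrative prevalence of 0.10.
fractions = {k: expected_tests_per_sample(0.10, k) for k in (5, 10, 20)}
```

At a fixed prevalence, larger groups are more likely to be positive, so the retesting cost eventually outweighs the savings from pooling.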

Simulation
We compare the proposed two-phase design with both a standard case-control design and a standard group testing design in terms of expected numbers of tests and statistical power. We generate data on 15,000 antibodies for 500 cases and 500 controls. The probability for a particular antibody j for individual i is given by

With the resulting antibody probabilities (mean $p^{\text{antibody}}_{ij} = 0.1$), we generate 15,000 antibody outcomes for each individual using a binomial distribution. The random effect $b_i$ incorporates an exchangeable correlation structure between antibody responses on the same individual.
The 0.73 in the above equation reflects the case-control differences on the probit scale for the first 200 antibodies. The remaining 14,800 antibodies have no case-control differences. In the following simulations we evaluate power based on the first 200 antibodies and type I error from the remaining antibodies.
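The model equation itself is not reproduced here, so the following Python sketch is hypothetical. It assumes a probit model with a subject-level normal random intercept, in which the probability for individual i and antibody j is Φ(intercept + 0.73 × case × [j is an affected antibody] + b_i). Only the 0.73 effect and the idea of a shared random effect come from the text. The intercept, the random-effect standard deviation, and the scaled-down dimensions are placeholder assumptions.

```python
import math
import random


def normal_cdf(z: float) -> float:
    """Standard normal CDF, used as the inverse probit link."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def simulate_antibodies(n_cases, n_controls, n_antibodies, n_affected,
                        effect=0.73, intercept=-1.4, random_sd=0.5, seed=0):
    """Return a list of (is_case, outcomes) pairs with 0/1 antibody outcomes.

    intercept and random_sd are HYPOTHETICAL placeholders, not values from the paper.
    """
    rng = random.Random(seed)
    data = []
    for i in range(n_cases + n_controls):
        is_case = i < n_cases
        b_i = rng.gauss(0.0, random_sd)  # shared across one person's antibodies
        outcomes = []
        for j in range(n_antibodies):
            shift = effect if (is_case and j < n_affected) else 0.0
            p_ij = normal_cdf(intercept + shift + b_i)
            outcomes.append(1 if rng.random() < p_ij else 0)
        data.append((is_case, outcomes))
    return data


# Scaled-down illustration: 200 cases, 200 controls, 20 antibodies, first 5 affected.
data = simulate_antibodies(200, 200, n_antibodies=20, n_affected=5)
```

Because b_i enters every antibody probability for the same person, responses within an individual are positively correlated, which is one way to obtain an exchangeable correlation structure.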

Simulation results
The two-phase design requires investigators to specify $c_1$. Choosing a value of $c_1$ that is too large (close to 1) sends a large number of antibodies on to phase 2, which leads to a large number of tests. On the other hand, choosing a value of $c_1$ that is too small results in a small number of tests but low power. As a compromise we chose $c_1$ = 0.3; later in the simulation we examine the sensitivity of the results to this choice.
The comparison between the different designs is presented in Table 1. There is a large reduction in the expected number of tests with two-phase group testing relative to the case-control design. The two-phase design with group size 5 has similar statistical properties (power and type I error rate) to the case-control design and uses only 32% of the tests. The standard group testing design with a group size of 5 has the same statistical properties as the case-control design and also reduces the number of tests, but it still uses 58% of the tests used in the case-control design. For a larger group size (group size of 10), the two-phase design performs well (similar to a group size of 5), while the standard group testing design is less efficient.
We examined the sensitivity of the simulation results to the choice of $c_1$ at alternative values of 0.1, 0.2, 0.3, and 0.4 in Table 2. All choices resulted in substantial efficiency gain relative to the case-control design. Choosing $c_1$ at 0.2 or 0.3 appears to be a good balance between power and the expected numbers of tests.
For the antibody testing conducted in this epidemiologic setting, there is little evidence for dilution error in the range of group sizes we are considering. In particular, perfect sensitivity is expected. That said, we conducted a simulation study examining the properties of the proposed group testing method under losses of sensitivity. Table 3 shows the operating characteristics for a $c_1$ cutoff of 0.3 and sensitivities of 0.95 and 0.98. The results are nearly indistinguishable from the case of perfect sensitivity shown in Table 2.

Example results
We analyzed the case-control study data described in the introduction (3,055 antibodies in 50 cases and 50 controls with group size 5) [6] using the case-control, standard group testing, and two-phase group testing designs. Antibody serology was normalized relative to the median raw expression values for all proteins on a given array and a value of 2 was chosen as the threshold for determining antibody positivity based on the experience of the laboratory [6].
We found that the case-control design identified four antibodies at the 0.05 significance level. The two-phase design identified the same four antibodies. With a small sample size, we would anticipate low power for identifying antibody effects. In practice, studies will have larger sample sizes. We evaluated this by resampling a larger number of cases and controls from the original dataset (resampling with replacement from the original dataset, creating a dataset with 500 cases and 500 controls).
We investigated designs with group sizes of 5, 10, and 20. Results are shown in Table 4. Under the case-control design, 642 of 3,055 antibodies are significant and 2,413 are not significant. Of the 642 antibodies that are significant under the case-control design, 641 antibodies are significant under the two-phase design for a group size of 5; 635 and 621 are significant for group sizes of 10 and 20, respectively. Of the 2,413 antibodies that are not significant under the case-control design, 2,400, 2,399, and 2,401 are not significant with a two-phase design with group sizes of 5, 10, and 20, respectively. Table 5 shows the expected number of tests under a two-phase design for different group sizes. The two-phase design is substantially more efficient with respect to the expected number of tests as compared with the case-control design. The case-control design uses 3,055,000 tests, while the two-phase group testing design with group size

Table 4 Concordance of Antibody Identification Among Designs when Applied to Example Data. Results of implementing the designs on resampled example data, comparing the case-control design and two-phase group testing design with group sizes 5, 10, and 20. Note that the standard group testing design will identify the same significant antibodies as the case-control design, so results are not explicitly listed for simplicity.